Minimum Bayes-Risk Word Alignments of Bilingual Texts
Authors
Abstract
We present Minimum Bayes-Risk word alignment for machine translation. This statistical, model-based approach attempts to minimize the expected risk of alignment errors under loss functions that measure alignment quality. We describe various loss functions, including some that incorporate linguistic analysis as can be obtained from parse trees, and show that these approaches can improve alignments of the English-French Hansards.
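The decision rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the model's posterior is approximated by a small list of candidate alignments with scores, and it uses a simple link-count loss (the size of the symmetric difference between two link sets) as a stand-in for the loss functions the paper describes. The names `mbr_alignment` and `link_error_loss` are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

Link = Tuple[int, int]        # (source word index, target word index)
Alignment = FrozenSet[Link]   # an alignment is a set of links


def link_error_loss(candidate: Alignment, reference: Alignment) -> float:
    """Assumed loss: number of links present in exactly one alignment."""
    return float(len(candidate ^ reference))


def mbr_alignment(
    candidates: Sequence[Alignment],
    posteriors: Sequence[float],
    loss: Callable[[Alignment, Alignment], float] = link_error_loss,
) -> Alignment:
    """Return the candidate with the lowest expected loss.

    The expectation is taken over the candidate list itself, which is
    assumed to approximate the model's posterior over alignments.
    """
    if not candidates or len(candidates) != len(posteriors):
        raise ValueError("need one posterior score per candidate")
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posterior scores must sum to a positive value")

    def expected_loss(hypothesis: Alignment) -> float:
        # Normalise the scores so they behave like probabilities.
        return sum(
            (p / total) * loss(hypothesis, other)
            for other, p in zip(candidates, posteriors)
        )

    return min(candidates, key=expected_loss)


# Toy example: three candidate alignments with made-up posterior scores.
a1 = frozenset({(0, 0), (1, 1)})
a2 = frozenset({(0, 0), (1, 2)})
a3 = frozenset({(0, 0), (1, 2), (2, 1)})
best = mbr_alignment([a1, a2, a3], [0.4, 0.3, 0.3])
```

In this toy case the most probable candidate is `a1`, but the rule selects `a2`, because `a2` is closer on average to the alternatives that together carry most of the probability mass. Swapping in a different `loss` function changes which alignment is preferred without changing the underlying model.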
Similar Resources
Minimum Bayes-Risk Decoding for Statistical Machine Translation
We present Minimum Bayes-Risk (MBR) decoding for statistical machine translation. This statistical approach aims to minimize expected loss of translation errors under loss functions that measure translation performance. We describe a hierarchy of loss functions that incorporate different levels of linguistic information from word strings, word-to-word alignments from an MT system, and syntactic...
Inducing Translation Grammars from Bracketed Alignments
This paper presents an algorithm for generating and filtering an invertible and structurally analogous translation grammar from bilingually aligned and linguistically bracketed texts. The algorithm is discussed in general terms and applied to German-English alignments. It is shown that the induction of structurally analogous translation grammars can lead to disambiguation of meaning and correction of b...
Building A Training Corpus For Word Sense Disambiguation In English-To-Vietnamese Machine Translation
The most difficult task in machine translation is the elimination of ambiguity in human languages. A given word in English, as in Vietnamese, often has different meanings that depend on its syntactic position in the sentence and on the actual context. To resolve this ambiguity, people formerly resorted to many hand-coded rules. Nevertheless, manually building these rules ...
An Algorithm for Simultaneously Bracketing Parallel Texts by Aligning Words
We describe a grammarless method for simultaneously bracketing both halves of a parallel text and giving word alignments, assuming only a translation lexicon for the language pair. We introduce inversion-invariant transduction grammars which serve as generative models for parallel bilingual sentences with weak order constraints. Focusing on transduction grammars for bracketing, we formulate a n...
Combining Multiple Alignments to Improve Machine Translation
Word alignment is a critical component of machine translation systems. Various methods for word alignment have been proposed, and different models can produce significantly different outputs. To exploit the advantages of different models, we propose three ways to combine multiple alignments for machine translation: (1) alignment selection, a novel method to select an alignment with the least ex...